Knowledge Discovery in Large Image Databases: Dealing with Uncertainties in Ground Truth

Authors

  • Padhraic Smyth
  • Michael C. Burl
  • Usama M. Fayyad
  • Pietro Perona
Abstract

This paper discusses the problem of knowledge discovery in image databases with particular focus on the issues which arise when absolute ground truth is not available. The problem of searching the Magellan image data set in order to automatically locate and catalog small volcanoes on the planet Venus is used as a case study. In the absence of calibrated ground truth, planetary scientists provide subjective estimates of ground truth based on visual inspection of Magellan images. The paper discusses issues which arise in terms of elicitation of subjective probabilistic opinion, learning from probabilistic labels, and effective evaluation of both scientist and algorithm performance in the absence of ground truth. Data from the Magellan volcano detection project is used to illustrate the various techniques which we have developed to handle these issues. The primary conclusion of the paper is that knowledge discovery methodologies can be modified to handle lack of absolute ground truth provided the sources of uncertainty in the data are carefully handled.

Similar resources

VascuSynth: Simulating vascular trees for generating volumetric image data with ground-truth segmentation and tree analysis

Automated segmentation and analysis of tree-like structures from 3D medical images are important for many medical applications, such as those dealing with blood vasculature or lung airways. However, there is an absence of large databases of expert segmentations and analyses of such 3D medical images, which impedes the validation and training of proposed image analysis algorithms. In this work, ...

Interpreting Image Databases by Region Classification

This paper addresses automatic interpretation of images of outdoor scenes. The method allows instances of objects from a number of generic classes to be identified: vegetation, buildings, vehicles, roads, etc., thereby enabling image databases to be queried on scene content. The feature set is based, in part, on psychophysical principles and includes measures of colour, texture and shape. Using ...

Automated knowledge discovery in advanced knowledge management

Knowledge management is a discipline with many faces; among the more provocative is the research area dealing with automatic discovery of the hidden truth within the data describing the world around us. The basic idea of knowledge discovery is to let the computer search for the knowledge while humans give only broad directions about where and how to search. Surprisingly, it is often the...

A Review of Data Mining Applications in the Health System

Introduction: The extensive amounts of data stored in medical databases require the development of specialized tools for accessing the data, data analysis, knowledge discovery, and the effective use of the data. Data mining is one of the most important methods. The article sketches the data mining techniques used and illustrates their applicability to medical diagnostic and prognostic problems. ...

An exploration of diversified user strategies for image retrieval with relevance feedback

Given the difficulty of setting up large-scale experiments with real users, the comparison of content-based image retrieval methods using relevance feedback usually relies on the emulation of the user, following a single, well-prescribed strategy. Since the behavior of real users cannot be expected to comply with strict specifications, it is very important to evaluate the sensitivity of the ret...

Publication date: 1994